49 research outputs found

    Efficient shallow learning as an alternative to deep learning

    Full text link
    The realization of complex classification tasks requires training of deep learning (DL) architectures consisting of tens or even hundreds of convolutional and fully connected hidden layers, which is far from the reality of the human brain. According to the DL rationale, the first convolutional layer reveals localized patterns in the input, and the following layers reveal progressively larger-scale patterns, until the architecture reliably characterizes a class of inputs. Here, we demonstrate that with a fixed ratio between the depths of the first and second convolutional layers, the error rates of the generalized shallow LeNet architecture, consisting of only five layers, decay as a power law with the number of filters in the first convolutional layer. The extrapolation of this power law indicates that the generalized LeNet can achieve the small error rates previously obtained for the CIFAR-10 database using DL architectures. A power law with a similar exponent also characterizes the generalized VGG-16 architecture; however, VGG-16 requires a significantly larger number of operations than LeNet to achieve a given error rate. This power-law phenomenon governs various generalized LeNet and VGG-16 architectures, hinting at its universal behavior and suggesting a quantitative hierarchical time-space complexity among machine learning architectures. Additionally, a conservation law along the convolutional layers, the square root of their size times their depth, is found to asymptotically minimize error rates. The efficient shallow learning demonstrated in this study calls for further quantitative examination using various databases and architectures, and for its accelerated implementation using future dedicated hardware developments.
    Comment: 26 pages, 4 figures (improved figure resolution)
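
    As a concrete illustration of the extrapolation step, here is a minimal Python sketch of fitting and extrapolating such a power law; the filter counts and error rates below are invented placeholders, not measurements from the paper.

```python
# Minimal sketch (not the authors' code): fitting the reported power-law decay
# err(d1) ~ A * d1**(-rho), where d1 is the number of filters in the first
# convolutional layer and err is the test error rate. All numbers below are
# hypothetical placeholders, not values from the paper.
import numpy as np

d1 = np.array([8, 16, 32, 64, 128])               # first-layer filter counts (assumed)
err = np.array([0.42, 0.33, 0.26, 0.205, 0.16])   # error rates (hypothetical)

# A power law is a straight line in log-log space: log(err) = log(A) - rho*log(d1)
slope, intercept = np.polyfit(np.log(d1), np.log(err), deg=1)
rho, A = -slope, np.exp(intercept)
print(f"fitted exponent rho = {rho:.3f}, prefactor A = {A:.3f}")

# Extrapolate to a wider first layer to estimate the achievable error rate
d1_large = 1024
print(f"extrapolated error at d1={d1_large}: {A * d1_large**(-rho):.4f}")
```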

    Enhancing the success rates by performing pooling decisions adjacent to the output layer

    Full text link
    Learning classification tasks of (2^n × 2^n) inputs typically involves at most n (2×2) max-pooling (MP) operators along the entire feedforward deep architecture. Here we show, using the CIFAR-10 database, that pooling decisions adjacent to the last convolutional layer significantly enhance accuracy success rates (SRs). In particular, the average SRs of the advanced VGG with m layers (A-VGGm) architectures are 0.936, 0.940, 0.954, 0.955, and 0.955 for m = 6, 8, 14, 13, and 16, respectively. The results indicate that the SR of A-VGG8 is superior to that of VGG16, and that the SRs of A-VGG13 and A-VGG16 are equal and comparable to that of Wide-ResNet16. In addition, replacing the three fully connected (FC) layers with one FC layer (A-VGG6 and A-VGG14), or with several linear-activation FC layers, yielded similar SRs. These significantly enhanced SRs stem from training the most influential input-output routes, in comparison with the inferior routes selected after multiple MP decisions along the deep architecture. In addition, the SRs are sensitive to the order of the non-commutative MP and average-pooling operators adjacent to the output layer, which varies the number and location of the training routes. The results call for a reexamination of previously proposed deep architectures and their SRs, utilizing the proposed pooling strategy adjacent to the output layer.
    Comment: 27 pages, 3 figures, 1 table and Supplementary Information
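
    For readers who want the architectural idea in code, below is a hypothetical PyTorch sketch of a small network whose pooling decision is deferred until adjacent to the output layer; the layer sizes are invented and this is not the authors' A-VGG definition.

```python
# Illustrative sketch only, assuming PyTorch; sizes are invented, not the
# authors' exact A-VGG. The point: convolutions keep the full spatial
# resolution, and the max-pooling decision is deferred to just before the
# output layer instead of being interleaved after every block.
import torch
import torch.nn as nn

class PoolLateNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(            # no pooling between conv layers
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(),
        )
        # Pooling decision adjacent to the output: collapse 32x32 maps at once
        self.late_pool = nn.MaxPool2d(kernel_size=8)            # 32x32 -> 4x4
        self.classifier = nn.Linear(256 * 4 * 4, num_classes)   # single FC layer

    def forward(self, x):
        x = self.features(x)
        x = self.late_pool(x)
        return self.classifier(x.flatten(1))

logits = PoolLateNet()(torch.randn(2, 3, 32, 32))  # CIFAR-10-shaped input
print(logits.shape)                                # torch.Size([2, 10])
```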

    The mechanism underlying successful deep learning

    Full text link
    Deep architectures consist of tens or hundreds of convolutional layers (CLs) that terminate with a few fully connected (FC) layers and an output layer representing the possible labels of a complex classification task. According to the existing deep learning (DL) rationale, the first CL reveals localized features from the raw data, whereas the subsequent layers progressively extract higher-level features required for refined classification. This article presents an efficient three-phase procedure for quantifying the mechanism underlying successful DL. First, a deep architecture is trained to maximize the success rate (SR). Next, the weights of the first several CLs are fixed and only the concatenated new FC layer connected to the output is trained, resulting in SRs that progress with the layers. Finally, the trained FC weights are silenced, except for those emerging from a single filter, enabling the quantification of the functionality of this filter using a correlation matrix between input labels and averaged output fields; hence, a well-defined set of quantifiable features is obtained. Each filter essentially selects a single output label independent of the input label, which would seem to prevent high SRs; however, it counterintuitively identifies a small subset of possible output labels. This feature is an essential part of the underlying DL mechanism and is progressively sharpened with the layers, resulting in enhanced signal-to-noise ratios and SRs. Quantitatively, this mechanism is exemplified by the VGG-16, VGG-6, and AVGG-16 architectures. The proposed mechanism underlying DL provides an accurate tool for identifying each filter's quality and is expected to direct additional procedures to improve the SR, computational complexity, and latency of DL.
    Comment: 33 pages, 8 figures
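
    The freeze-then-probe phases lend themselves to a short sketch. The hypothetical PyTorch fragment below (stand-in layer sizes, not the authors' VGG code) shows phase 2 (fixed CLs, trainable FC readout) and phase 3 (silencing all FC weights except those emerging from one filter).

```python
# Hedged sketch of the three-phase probing idea, assuming PyTorch; the conv
# stack is a generic stand-in for a trained CNN, not the authors' VGG-16.
import torch
import torch.nn as nn

conv_stack = nn.Sequential(                      # stand-in "trained" conv layers
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
)
for p in conv_stack.parameters():                # Phase 2: fix the CL weights
    p.requires_grad = False

readout = nn.Linear(16 * 16 * 16, 10)            # new FC layer to the 10 labels
optimizer = torch.optim.SGD(readout.parameters(), lr=0.01)  # trains only the FC

x = torch.randn(4, 3, 32, 32)
feats = conv_stack(x).flatten(1)                 # (4, 16 channels * 16 * 16)

# Phase 3: silence all FC weights except those fed by a single filter.
filter_idx, spatial = 5, 16 * 16
mask = torch.zeros_like(readout.weight)
mask[:, filter_idx * spatial:(filter_idx + 1) * spatial] = 1.0
with torch.no_grad():
    readout.weight.mul_(mask)                    # one filter's routes survive

# Averaged output fields of this lone filter; correlating such outputs with the
# input labels is how the abstract quantifies each filter's functionality.
print(readout(feats).mean(dim=0))
```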

    Exploring The Design of Prompts For Applying GPT-3 based Chatbots: A Mental Wellbeing Case Study on Mechanical Turk

    Full text link
    Large-Language Models like GPT-3 have the potential to enable HCI designers and researchers to create more human-like and helpful chatbots for specific applications. But evaluating the feasibility of these chatbots and designing prompts that optimize GPT-3 for a specific task is challenging. We present a case study in tackling these questions, applying GPT-3 to a brief 5-minute chatbot that anyone can talk to in order to better manage their mood. We report a randomized factorial experiment with 945 participants on Mechanical Turk that tests three dimensions of prompt design used to initialize the chatbot (identity, intent, and behaviour), and present both quantitative and qualitative analyses of the conversations and of user perceptions of the chatbot. We hope other HCI designers and researchers can build on this case study when applying GPT-3-based chatbots to other specific tasks, and can extend the methods we use for prompt design and evaluation.
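
    A minimal sketch of how such a factorial prompt space can be enumerated; the prompt texts below are invented placeholders, not the study's actual wording, and only the structure (identity × intent × behaviour) follows the abstract.

```python
# Minimal sketch of a factorial prompt-design space. Dimension values are
# hypothetical placeholders, not the study's prompts.
from itertools import product

identity = ["You are Ava, a friendly companion.", "You are an assistant."]
intent = ["Your goal is to help the user manage their mood.",
          "Your goal is to chat casually."]
behaviour = ["Ask open-ended questions and listen.", "Offer concrete suggestions."]

# 2 x 2 x 2 = 8 prompt conditions for a randomized factorial experiment
conditions = [" ".join(parts) for parts in product(identity, intent, behaviour)]
for i, prompt in enumerate(conditions):
    print(f"condition {i}: {prompt}")
```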

    Current evidence for a modulation of low back pain by human genetic variants

    Get PDF
    The manifestation of chronic back pain depends on structural, psychosocial, occupational and genetic influences. Heritability estimates for back pain range from 30% to 45%. Genetic influences are caused by genes affecting intervertebral disc degeneration or the immune response and by genes involved in pain perception, signalling and psychological processing. This inter-individual variability, which is partly due to genetic differences, would require individualized pain management to prevent the transition from acute to chronic back pain or to improve the outcome. The genetic profile may help to define patients at high risk for chronic pain. We summarize genetic factors that (i) impact intervertebral disc stability, namely Collagen IX, COL9A3, COL11A1, COL11A2, COL1A1, aggrecan (ACAN), cartilage intermediate layer protein, vitamin D receptor, matrix metalloproteinase-3 (MMP3), MMP9, and thrombospondin-2, (ii) modify inflammation, namely interleukin-1 (IL-1) locus genes and IL-6, (iii) modulate pain signalling, namely guanosine triphosphate (GTP) cyclohydrolase 1, catechol-O-methyltransferase, the μ opioid receptor (OPRM1), the melanocortin 1 receptor (MC1R), transient receptor potential channel A1, and fatty acid amide hydrolase, and (iv) affect analgesic drug metabolism (cytochrome P450 [CYP] 2D6, CYP2C9).

    Abstracts of presentations on plant protection issues at the fifth international Mango Symposium: September 1-6, 1996, Dan Panorama Hotel, Tel Aviv, Israel; and at the Xth international congress of Virology: August 11-16, 1996, Binyanei haoma, Jerusalem, Israel

    Get PDF

    Cell-based tissue engineering strategies used in the clinical repair of articular cartilage

    Full text link

    An In-Depth Evaluation of Externally Amplified Coupling (EAC) Attacks — a Concrete Threat for Masked Cryptographic Implementations

    No full text
    Masking is a systematic countermeasure to achieve side-channel security for cryptographic algorithms. However, its secure implementation relies on an independence assumption that can be violated by signal coupling. It has been established that coupling induced within a device can be detrimental. It was demonstrated on a 1st-order secure design (i.e., with two shares) that an adversary who can manipulate the design’s power-measurement setup can externally induce significant coupling, and can thus concretely reduce the “effective security order”, i.e., make 1st-order leakages as significant as 2nd-order ones with fewer measurements. This paper explores the impact of such external amplification phenomena on fabricated hardware test cases for the first time. We designed a dedicated ASIC to extend the empirical results, demonstrating the impact up to the 4th order. We systematically evaluated factors related to adversarial control, e.g., the external measurement resistance, and investigated their influence relative to intra-design factors, i.e., the internal power-grid resistance and the transistors’ inherent resistance. Our study demonstrates that externally amplified coupling scales up to concrete masked hardware designs with various numbers of shares and is not very sensitive to intra-design parameters, providing experimental evidence that such coupling should be considered during masking validation.
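
    An "effective security order" is commonly assessed with order-d leakage detection. Below is a generic, hypothetical sketch of a TVLA-style Welch t-test at orders 1 through 4 on synthetic traces; this is standard side-channel methodology, not the paper's measurement code.

```python
# Hedged sketch of order-d leakage assessment on synthetic (leakage-free)
# traces. At order d >= 2, each trace sample is centered and raised to the
# d-th power before the fixed-vs-random Welch t-test.
import numpy as np

rng = np.random.default_rng(0)
fixed = rng.normal(0.0, 1.0, size=(5000, 200))    # synthetic "fixed" trace set
random_ = rng.normal(0.0, 1.0, size=(5000, 200))  # synthetic "random" trace set

def order_d_ttest(a: np.ndarray, b: np.ndarray, d: int) -> np.ndarray:
    """Welch t-statistic per trace sample after d-th order preprocessing."""
    if d >= 2:                                    # first order uses raw traces
        a = (a - a.mean(axis=0)) ** d             # centered d-th power
        b = (b - b.mean(axis=0)) ** d
    na, nb = len(a), len(b)
    return (a.mean(0) - b.mean(0)) / np.sqrt(a.var(0, ddof=1) / na
                                             + b.var(0, ddof=1) / nb)

for d in (1, 2, 3, 4):
    t = order_d_ttest(fixed, random_, d)
    # |t| > 4.5 is the conventional leakage-detection threshold
    print(f"order {d}: max |t| = {np.abs(t).max():.2f}")
```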